The Future of Storage: NVMe and 3D X-Point

A survey paper by Trever Wagenhals

Abstract:

The purpose of this survey paper is to discuss the advancements of storage interfaces, the limitations of each interface, and how these evolutions are all leading to the implementation of the NVMe standard and 3D X-Point. The potential performance and architecture of 3D X-Point and NVMe devices in comparison to older technologies are also discussed.

Starting in February 2016, I began an internship at Teradyne, a company that primarily focuses its development on semiconductor testing. Teradyne also has a division that focuses on asynchronous automatic hard drive testing machines. As of now, Teradyne sells machines of various sizes that are capable of testing Serial Attached SCSI (SAS) and Serial Advanced Technology Attachment (SATA) hard drives. With newer technologies emerging, however, Teradyne must also shift its focus towards the new Non-Volatile Memory Express (NVMe) device form factors and methods to test them. As an intern, I have been researching methods to hot-plug Peripheral Component Interconnect Express (PCIe) based cards and have had the privilege to benchmark and test some of the fastest drives on the market. Through this internship, my interest in the various form factors and architectures of storage drives has been piqued. Having spent some time working closely with the subject, I feel that many people are not fully aware of recent drive advancements or of the need for advancement in the storage field, and so I am writing this paper to discuss the history of storage interface protocols. This discussion will cover the outdated Parallel ATA (PATA) interface, the SATA interface, and, in more detail, the newly emerging NVMe specification. I also briefly discuss the future of storage devices and how new drives will eventually make their way to the memory bus with 3D X-Point technology.

As technology develops over the years and systems become more integrated, the need for an increased bandwidth arises to allow for a continuous flow of data communication. This idea has been the main motivation for the majority of progress in technology. Impressively, for the past several decades, the bandwidth of the central processing unit (CPU) has been able to double approximately every 8 months [1]. Although impressive, these large advancements have always experienced a bottleneck factor due to the bandwidth memory has been able to provide. Since the 1980s, storage interface protocols have been implemented and developed to try to match the rate of CPU development and minimize the bottlenecking factor. PATA was the first standard introduced in the 1980s and quickly became the standard for interfacing to disk drives. When originally launched, PATA allowed for speeds upward to 3 Megabytes per second (MBps). Since its original launch, PATA has been able to introduce newer protocols to be able to push the limits upward to as much as 133MBps [1].

At the time, a protocol that allowed a bandwidth of 133MBps was not a limiting factor. That changed once hard drives developed to the point where these speeds could be achieved. Developments in cache size, revolutions per minute (RPM), firmware, and controllers allowed newer hard drives to push these speeds and caused the PATA protocol to become the limiting factor. Although it may seem plausible to simply develop the PATA protocol further to allow for higher throughput, PATA itself suffers from many constraints.

Ultra ATA, the PATA protocol utilized to allow up to 133MBps throughput, uses a traditional non-locking clock signal, which allows for broadcasting delay to be avoided. With bidirectional communication, 2 bytes per clock can be transmitted. Operating at 50 Megahertz (MHz), the protocol is able to operate at a capacity of 100MBps. To increase throughput even further, the frequency must be increased; however, because the frequency is already at the edge of skew and action delay, it cannot be increased much more. Also, with newer integrated circuits (ICs) only needing a 3.3V signal, the power efficiency of PATA and the ease of integration demanded a new storage interface protocol. Lastly, the high base pin count of ATA interfaces increased the difficulty of designing chips and wiring main boards [1].

Another protocol that was designed during the 1980s, just like PATA, is the small computer system interface (SCSI). The SCSI protocol targeted the idea of servers and businesses more than PATA, which tended to lean more towards consumer computers. Just like PATA, SCSI uses parallel communication, but this is probably the only place similarities can be found. Because PATA was designed for consumer level use, it tended to only support up to two devices per system, meaning the application in data centers was not very plausible. SCSI, on the other hand, supported a total of 16 disks connected at once, making it much more effective in these scenarios. Another major advantage that SCSI had over PATA connections is that it supported up to 320MBps, which is almost 3 times as large as PATA [2]. It is because of this higher throughput that SCSI drives tended to be designed at a higher RPM, allowing for faster drives. Although SCSI seemed to beat PATA performance-wise, the complexity of integrating the protocol as well as the much higher price point made the decision less clear than one would assume. Also, the fact that the 320MBps was designed to be shared among all connected devices meant that, although it was designed to interface with so many devices, they would suffer huge performance drops if all devices were ever used at the same time.

Because of all of the previously mentioned issues that PATA experienced, SATA was developed by Intel in an attempt to alleviate the bottleneck. The first version of SATA, released in 2001, allowed for a throughput of 150MBps. The new protocol only utilizes 4 pins in an ideal state: power, ground, transmit, and receive, as opposed to the minimum of 26 pins needed by the PATA standard. Because SATA also utilizes a 3.3V signal, it is designed to be more power efficient and to integrate with newly developed ICs more easily. By the second revision, SATA supported a throughput of 300MBps, and by the third revision up to 600MBps [1].

Several years later, in 2004, another protocol, Serial Attached SCSI (SAS), was designed to target the issues faced by SCSI. The first edition of SAS supported up to 300MBps per device. This number represents a slightly smaller maximum throughput for a single device than the old SCSI protocol, but because these speeds are available to each device, adding multiple devices does not start to bottleneck the system. The SAS protocol was also designed with the then-new SATA standard in mind; the SAS connectors were made to support both protocols so that the transition for businesses was much easier. If a company wanted to upgrade its framework before switching its SATA drives to SAS drives, it could do so, giving companies even more flexibility [2]. SAS drives also tend to support much higher RPM than standard SATA drives, meaning that the performance from these drives was generally greater. Where a SATA drive typically spins at 7,200 RPM, SAS drives are designed for up to 15,000 RPM. Today, SAS supports up to 12Gbps, twice that of SATA.
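
The difference between a shared parallel bus and dedicated serial links can be illustrated with a short calculation. The sketch below is a simplified model rather than a description of real bus arbitration: it assumes that a shared bus divides its bandwidth evenly among the devices active at the same time, and the function name per_device_mbps is hypothetical. The 320MBps and 300MBps figures are the ones quoted above.

```python
def per_device_mbps(link_mbps: float, active_devices: int, shared: bool) -> float:
    """Best-case bandwidth each device sees, in MB/s.

    Simplifying assumption: a shared bus divides its bandwidth evenly
    among the devices that are active at the same time.
    """
    if active_devices < 1:
        raise ValueError("at least one device must be active")
    return link_mbps / active_devices if shared else float(link_mbps)


if __name__ == "__main__":
    for n in (1, 4, 16):
        scsi = per_device_mbps(320, n, shared=True)   # parallel SCSI, one shared bus
        sas = per_device_mbps(300, n, shared=False)   # first-generation SAS, per device
        print(f"{n:2d} active devices: SCSI {scsi:6.1f} MB/s each, SAS {sas:6.1f} MB/s each")
```

Under this model, a fully loaded 16-device SCSI bus leaves only 20MBps for each device, while each SAS device keeps its full 300MBps.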

Once SAS and SATA were developed, the bottleneck once again shifted away from the interface protocol and onto the hard drives. With hard drives reaching 15,000 RPM, newer software being written, and larger capacities on a single drive, the traditional mechanical hard drive made numerous improvements in efficiency and performance. Despite all of these accomplishments, mechanical disk drives still did not come close to the maximum throughput allowed by SATA 3.0. Even the highest grade enterprise hard drives that were being designed for servers at the time could only produce a throughput of less than 250MBps. The physical limitations of vibration, life cycle depletion, and heat from increased RPMs once again demanded a change in standard, and this time it was in the drive design instead of the interface protocol. It was because of these limitations that the Solid State Drive (SSD) was born.

SSDs are drives that consist of millions of flash memory cells with non-volatile characteristics that ensure data is not lost between uses. Structurally, flash memory cells are composed of a single transistor that either includes or does not include a contact. A contacted transistor is classified as NOR type, which has the ability to perform random accesses. On the other hand, a cell without a contact is considered NAND type, and does not have the random access capability [3]. Although NOR technology was introduced in 1988 by Intel, and NAND technology was introduced in 1989 by Toshiba [4], it was not until the last decade that these technologies were designed to be implemented into storage.

With two technologies to consider when designing the SSD concept, the pros and cons had to be weighed to determine which to advance with. Although NAND type lacks the ability for random accesses, it has quickly become the dominating design due to several reasons. NAND technology allows a better ability to scale in terms of capacity, which also results in a lower cost associated with expandability. NAND technology also has a much faster write performance and a significantly faster erase performance. Because of these reasons, combined with the fact that NAND technology allows ten times as many erase cycles as its NOR counterpart, it has quickly become the choice for high capacity data storage [3].

In the early stages of SSD development, the cost per gigabyte of data was so high that the concept was not commercially viable. It wasn’t until breakthroughs such as multi-bit per cell technology that SSDs began to be commercially affordable. Multi-bit per cell technology allowed two-bit or three-bit per cell devices to be developed, dramatically increasing storage capacity and lowering price [3]. With SSDs slowly starting to make their way to the market, the sheer performance differences were immediately noted. Solid state drives were now able to push throughputs of up to 550MBps for read and write speeds, easily doubling the throughput seen from high end mechanical disk drives. Aside from throughput, SSD devices also brought a huge improvement in latency. Traditional mechanical drives typically experienced 20-30ms of latency between IO tasks, while SSDs dropped the latency down to 30-50us [4]. This change is a latency decrease by a factor of several hundred compared to older drives, making SSDs even more desirable.
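
Taking the latency ranges quoted above, the size of the improvement can be computed directly. The following is a minimal sketch; the helper name improvement_range and the constant names are hypothetical, and the inputs are simply the 20-30ms and 30-50us figures from the text converted to microseconds.

```python
def improvement_range(old_range_us, new_range_us):
    """Return the (smallest, largest) old/new ratio for two (low, high) latency ranges."""
    old_low, old_high = old_range_us
    new_low, new_high = new_range_us
    # Smallest gain: best old drive against worst new drive, and vice versa.
    return old_low / new_high, old_high / new_low


HDD_LATENCY_US = (20_000, 30_000)  # 20-30 ms expressed in microseconds
SSD_LATENCY_US = (30, 50)          # 30-50 us

low, high = improvement_range(HDD_LATENCY_US, SSD_LATENCY_US)
print(f"SSD latency is roughly {low:.0f}x to {high:.0f}x lower")
# prints: SSD latency is roughly 400x to 1000x lower
```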

To this day, many companies are slowly integrating SSD devices into their framework to allow for faster data handling and maximum throughput. Banks, Wall Street trading firms, and other operators of fast paced data centers have implemented these devices in order to maximize their performance. Now that SSDs have hit the commercial market and have narrowed the gap between the throughput limits of the SATA interface protocol and the limits of physical drives, the same issue that has struck in the past has returned. With SATA 3.0 speeds being 6.0 Gigabits per second (Gbps), the throughput limit of SATA seems very distant from the aforementioned 550MBps that SSDs today are capable of delivering. However, when the conversion from Gbps to MBps is made, it is easy to see that SSDs are now hitting the limit of the SATA interface protocol. 6.0 Gbps converted to Gigabytes per second (GBps) is 0.75 GBps, or 750MBps, since there are 8 bits in a byte. Lastly, the SATA interface protocol uses 8b/10b encoding, which results in a net loss of 20% throughput [3]. So, 8/10 of 750MBps is 600MBps. This is the actual maximum throughput SATA is capable of, and now it is easy to see how SSD devices are approaching this limit.
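
The conversion above can be reproduced in a few lines. This is a minimal sketch with a hypothetical helper name, effective_throughput_mbps; the 1.5 Gbps and 3.0 Gbps line rates for the first two SATA revisions are not stated in the text and are included here as an assumption, chosen because they reproduce the 150MBps and 300MBps figures quoted earlier.

```python
def effective_throughput_mbps(line_rate_gbps: float, payload_bits: int, coded_bits: int) -> float:
    """Convert a raw line rate in Gbps to usable MB/s after line encoding."""
    raw_mbps = line_rate_gbps * 1000 / 8           # gigabits/s -> megabytes/s
    return raw_mbps * payload_bits / coded_bits    # keep only the payload share


# Line rates for SATA 1.0 and 2.0 are assumptions (see the note above);
# the 6.0 Gbps rate for SATA 3.0 is the figure from the text.
for name, rate_gbps in [("SATA 1.0", 1.5), ("SATA 2.0", 3.0), ("SATA 3.0", 6.0)]:
    print(f"{name}: {effective_throughput_mbps(rate_gbps, 8, 10):.0f} MB/s")
```

For SATA 3.0 this yields 600MBps, which leaves a drive sustaining 550MBps with less than 10% of headroom.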

As history starts to repeat itself, SAS and SATA have begun to hit their limits in terms of throughput, while SSDs have closed the gap that mechanical drives were unable to. Now it is time for a new interface protocol standard to emerge, and luckily it has already been in development: the NVMe specification. The NVMe specification was designed by an industry consortium of more than 80 members and was led by Intel [6]. This large industry base ensures that newer companies attempting to get into drive design will conform to the NVMe specification. Instead of designing another storage interface protocol such as PATA or SATA, the authors of the NVMe specification decided to utilize a bus that is already present in all computers today and that dramatically improves the throughput and latency potential of all drives: the PCIe bus.

The benefits of choosing the PCIe bus as the standard are numerous. Like other storage protocols, PCIe is a full-duplex system and can support processes in random orders, as well as a queue of outstanding requests. Comparing the queue support of the NVMe specification to that of the Advanced Host Controller Interface (AHCI) used with ATA interfaces such as SATA, NVMe is capable of handling 64k queues with 64k commands per queue, while AHCI can only handle 1 queue with a maximum of 32 commands [8]. The PCIe bus is scalable down to a single (x1) lane and up to sixteen (x16) lanes, giving flexibility to the expansion of SSDs. With the newest PCIe 3.0 standard, each lane of PCIe allows for a throughput of about 1,000MBps. Compared to the SATA 3.0 standard, this means that a single lane of PCIe already allows a greater throughput than the current SATA specification, and it can be expanded to a value that is roughly 27 times as large by using x16 lanes. With 128b/130b encoding versus the 8b/10b encoding of SATA, less bandwidth is lost to encoding as well, making the PCIe interface more efficient [7].
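
The figures in this comparison follow from simple arithmetic, sketched below. The function and constant names are hypothetical, the per-lane value is the rounded 1,000MBps figure used in the text, and "64k" is interpreted as 64 x 1024, which is an assumption about the intended meaning.

```python
PCIE3_LANE_MBPS = 1000   # rounded per-lane PCIe 3.0 figure used in the text
SATA3_MBPS = 600         # SATA 3.0 after 8b/10b encoding


def pcie_throughput_mbps(lanes: int) -> int:
    """Aggregate throughput of a PCIe 3.0 link, scaling linearly with lane count."""
    if not 1 <= lanes <= 16:
        raise ValueError("the text describes link widths from x1 to x16")
    return lanes * PCIE3_LANE_MBPS


def encoding_overhead(payload_bits: int, coded_bits: int) -> float:
    """Fraction of the line rate spent on encoding rather than payload."""
    return 1 - payload_bits / coded_bits


def outstanding_commands(queues: int, commands_per_queue: int) -> int:
    """Upper bound on commands that can be queued at once."""
    return queues * commands_per_queue


if __name__ == "__main__":
    for lanes in (1, 4, 16):
        mbps = pcie_throughput_mbps(lanes)
        print(f"x{lanes}: {mbps} MB/s ({mbps / SATA3_MBPS:.1f}x SATA 3.0)")
    print(f"8b/10b overhead:    {encoding_overhead(8, 10):.1%}")
    print(f"128b/130b overhead: {encoding_overhead(128, 130):.1%}")
    k64 = 64 * 1024  # assumed meaning of "64k"
    print(f"NVMe: {outstanding_commands(k64, k64):,} commands, "
          f"AHCI: {outstanding_commands(1, 32)} commands")
```

The encoding comparison is the notable part: 8b/10b gives up 20% of the line rate, while 128b/130b gives up only about 1.5%.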

One of the major benefits of the PCIe bus compared to the SATA interface is its direct connection to the CPU subsystem, which results in much lower latencies due to the elimination of a host bus adapter (HBA). Where SATA SSDs were able to achieve latencies around 30-50us, as mentioned before, moving to the PCIe bus allows latencies to decrease further, down to 15-20us [8]. As large queues build up in data-intensive industries, the latency of each IO task adds up, and this decrease by a factor of 2 becomes significant. To use a real example, Amazon has been noted to lose 1% of sales for every 100ms the website takes to load [8]. Aside from performance, the PCIe bus is also designed to be low cost and power efficient. Because HBAs are not needed to connect drives directly to the PCIe lanes, neither power nor money has to be spent on this additional device, driving cost down for users significantly.

Although many of the benefits mentioned above are true for Add-in-card (AIC) SSDs, not all of these details are required for a drive to meet the NVMe specification. Although AICs make up a large portion of the NVMe devices that are produced, companies have managed to develop other form factors that still adhere to the NVMe specification, such as SFF-8639 (U.2) and M.2 Next Generation Form Factor (NGFF). These devices were designed with systems in mind where an AIC might not be practical.

The U.2 connector, just like SAS, was designed with flexibility in mind and allows companies to upgrade their framework before actually committing to purchasing drives to replace their old ones. The pinout for a U.2 connector is almost identical to that of SAS. Because of this similarity, just like SAS, it also supports SATA drives connected through it. In order for the U.2 connector to allow drives to meet the NVMe specification, however, it must also support PCIe connectivity [10]. The U.2 connector does this by filling the backplane of the connector with additional pins to allow up to four (x4) PCIe lanes to be brought out for one drive. It is because of this pinout that U.2 SSDs come in the standard 2.5” small form factor (SFF) that many other SSDs come in. Typically, most U.2 drives are connected via a U.2 to high density (HD) cable to an HBA in order to be recognized by the system.

The M.2 connector was designed with portability in mind, and is by far the smallest of the three form factors mentioned. M.2 devices are typically found in newer Ultrabooks that are highly limited in the space that they offer. As opposed to the bulky 2.5” drives that are typically found in laptops, M.2 drives are only 22mm wide and range from 30mm to 110mm in length. This decrease in size and change in standard offers laptops many advantages. Firstly, the decrease in size makes it possible to build laptops that are thinner and lighter. Also, M.2 drives tend to require less power than their SATA counterparts, meaning battery life can be increased. The additional space saved by an M.2 drive can also allow a larger battery to be placed underneath the shell, which once again results in longer battery life [9]. M.2 devices have been very easy to implement into laptops since they plug into a compact card slot that succeeds the PCIe mini port, a design that has been used for Wi-Fi cards for several years now. M.2 drives typically sport what is known as a “key”, which simply means a notch in the row of contact pins at a specific point. The most common keys for these devices are the B key, which leaves a 6 pin wide contact on the left, and the M key, which leaves a 5 pin wide contact on the right. M.2 drives may have one or both of these notches. A B-keyed device can use up to two (x2) PCIe lanes and an M-keyed device up to four (x4) PCIe lanes; a device notched for both keys fits either socket but is limited to two (x2) lanes.

As of writing this paper, NVMe is the newest commercially available protocol for interfacing storage, and is by far the fastest. With its large queues, higher throughput, and lower latency, it will not be a surprise when NVMe drives begin to dominate the market. Although the migration from older protocols to the PCIe bus has brought a dramatic increase in performance, it is never too early to begin developing the next standard for when NVMe becomes obsolete. Having the privilege of working at Teradyne, I was able to witness this realization first hand when a group of fellow engineers and I attended the University of New Hampshire (UNH) Plugfest. Plugfests are events run by the Interoperability Laboratory (IOL) at UNH, which has been responsible for testing networking and data communications products since its launch in 1988 [11]. Due to its success, Intel has placed the IOL in charge of all interop and conformance tests for the new NVMe specification. Essentially, each Plugfest is a consortium where a group of companies come together to test and solve mutual problems within their technological field. Each company that attends pays to become a member in hopes of getting its drive on the NVMe integrator’s list by passing the conformance and interop tests designed by UNH.

The conformance testing suite designed by UNH is a comprehensive test that ensures a device meets or exceeds the NVMe specification written by Intel. The interoperability testing suite designed by UNH checks that a device is compatible with a wide range of systems, so that it is highly likely to work in almost any environment [11]. It is because of these definitions that a drive is only required to pass a conformance test at one station, while it is required to pass at up to six stations for interoperability. Because Teradyne is a company that focuses on testing equipment, it does not actually have a device that is required to meet the NVMe specification, and thus does not have to run any conformance testing. For this reason, and because of the nature of Teradyne’s equipment, Teradyne was designated as a mandatory host station at which drive companies needed to pass interop tests with their drives in order to make it onto the integrator’s list. As a result, I was able to benchmark and compare dozens of drives over a short time span and get a better feeling for where the NVMe standard already stands.

Although many drives that came to our station suffered major firmware issues, thermal throttling troubles, and general connectivity issues, some of the drives that were presented were already showing signs of bottlenecking due to the PCIe bus. A few of the U.2 x4 drives that were tested were able to push speeds as high as 4,000MBps throughput. Based on my earlier description of the PCIe 3.0 standard, each lane is capable of outputting up to 1,000MBps, which means that these drives are already maximizing the throughput for the 4 lanes they are utilizing. Although the 4 lanes is only 1/4th of the maximum capability a device can borrow from the PCIe bus, it would be no surprise if the larger drives that emerged in the future showed a similar scalability and pushed past the bandwidth limits of PCIe 3.0.

Even though the NVMe specification has only recently been drafted, and drives have become commercially available even more recently, it is important to consider the next evolution in interface protocol to push past any bottlenecks. Currently, the only bus design that shows better throughput and response time than the PCIe bus is the memory bus, which has always been used for holding volatile DIMM memory. If this bus could be repurposed to hold large quantities of storage while remaining non-volatile, storage performance would increase dramatically from where it is today. Unsurprisingly, such a technology is already being designed, in a cooperative project between Intel and Micron known as 3D X-Point.

3D X-Point aims to solve a problem that has been present since Von Neumann originally discussed the architecture of computers and presents the first major breakthrough in memory design within the last 40 years, all while remaining inexpensive and non-volatile. Instead of using the same NAND design that traditional SSDs use today, 3D X-Point is believed to be a phase-change memory (PCM), although Intel and Micron have done a thorough job of keeping this information from becoming publicly known. PCM is exactly as it sounds: it typically uses a material called chalcogenide (a very glass-like material) that has the ability to change its state from amorphous to crystalline (and vice versa) based on the amount of electricity that is applied to the material [17]. Each state has a resistance associated with it, and this resistance directly represents the binary value that the bit holds. Because the cells are read and written simply by changing the voltage, the need for transistors is eliminated, which allows cost to be driven down and density to be increased. Due to this change in material, the endurance, which is measured in read/write cycles per lifetime, is 1,000 times higher than that of today’s flash memory [15]. This material also allows for much better scaling than NAND flash; currently, NAND technology loses the ability to function reliably when cell size drops below 10nm, while PCM materials do not experience this issue. This detail will allow 3D X-Point to be significantly denser than NAND technology, driving cost down even further. Because this newer technology will also be placed on the memory bus, which is significantly closer to the CPU than the PCIe bus or other interface protocols, the latency is expected to decrease by a factor of 1,000 as well [15]. 
With these two immense increases in performance and reliability, on top of the fact that this material still maintains the ability to be non-volatile, and can also be utilized in place of Random Access Memory (RAM) in much larger quantities, it is clear to see how this technology is considered such a major revolution.

Although Micron and Intel have done a phenomenal job in keeping the development of this technology a secret, there are a few ways in which these proposed numbers can be checked for plausibility. PCM has been in development since approximately 2008, when Micron announced its first 128Mbit device. By 2012, Micron announced a volume production 1Gbit PCM device, and a prototype was able to undergo extensive testing. This PCM SSD prototype was a PCIe based device, which supported PCIe 2.0 speeds of 500MBps per lane. By running 5 million 4KiB read and write tests, it was concluded that this prototype device averaged about 6.7us of latency for read requests, and about 128us of latency for write requests [20]. Compared to NVMe read latencies, this prototype device was actually able to run with a 3x smaller latency, although the write latency was significantly larger. Although these latencies are still about a thousand times larger than DRAM latencies, it is plausible that a prototype technology from four years ago, operating on a slower and outdated version of the PCIe bus, could since have made the breakthrough needed to reach the proposed speeds.

Another helpful way to gauge the potential of 3D X-Point is to compare the proposed performance to the already implemented Double Data Rate (DDR) standard of memory. DDR is a synchronous dynamic random-access memory (SDRAM), which allows for faster computation due to its locality to the CPU. Currently, DDR is on its fourth generation of SDRAM and offers the highest reliability, availability, and serviceability of all generations. Comparing the newest DDR4 memory to the recently discussed PCIe x16 NVMe devices that offer up to 16GBps throughput and 20us latency, there is no competition between the potentials [22]. DDR4 RAM has shown about 45GBps throughput in worst case scenarios, and latencies of about 70ns [23]. From these numbers alone, it is safe to conclude that SDRAM offers almost 3 times as much throughput potential and roughly 300 times lower latency than the fastest drives currently on the market. Although these numbers do not directly equate to 3D X-Point potential, it is evident through these two comparisons that it is highly plausible for this technology to meet the claims it has made.
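
Taking the figures quoted in this comparison at face value, the two ratios can be worked out directly. This is only a sketch of the arithmetic; the dictionary keys and variable names are hypothetical, and the inputs are the 16GBps/20us and 45GBps/70ns values from the text rather than new measurements.

```python
# Figures quoted in the text for a PCIe 3.0 x16 NVMe device and for DDR4.
nvme = {"throughput_gbytes_per_s": 16, "latency_s": 20e-6}
ddr4 = {"throughput_gbytes_per_s": 45, "latency_s": 70e-9}

# How many times more data DDR4 can move per second.
throughput_ratio = ddr4["throughput_gbytes_per_s"] / nvme["throughput_gbytes_per_s"]

# How many times longer an NVMe access takes than a DDR4 access.
latency_ratio = nvme["latency_s"] / ddr4["latency_s"]

print(f"DDR4 throughput advantage: {throughput_ratio:.1f}x")
print(f"DDR4 latency advantage:    {latency_ratio:.0f}x")
```

The throughput gap is just under 3x, while the latency gap is on the order of a few hundred, which is where the memory bus offers the larger gain.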

To conclude, CPU performance has almost always been bottlenecked by the storage throughput and latencies that could be achieved. CPU bandwidth has been able to double every 8 months, while storage protocols have only recently been able to break 1GBps. As hard drive design, speed, software, and density improved, the SCSI and PATA protocols were eventually succeeded by SAS and SATA. With these newer protocols, mechanical hard drives were unable to reach the maximum throughput allowed by the interfaces, and thus the SSD was born. SSDs quickly approached the limits of SATA and SAS, and now the NVMe specification has been written to define a storage interface over the PCIe bus. Although the PCIe bus offers an immense improvement in latency and throughput over older interfaces, benchmarks have already shown drives starting to push against the limits of PCIe. Because of these developments, the only reasonable move for the storage interface is directly onto the memory bus in order to further increase throughput and reduce latency. Luckily, Intel and Micron have already announced their development of such a technology, called 3D X-Point, which is based on a phase-changing material that will mimic the performance of SDRAM, all while maintaining greater than NAND densities and non-volatile properties. As big data continues to expand, the world becomes more interconnected, and CPU performance grows, a revolution in storage interfaces becomes necessary, and the future of storage is right on the horizon with 3D X-Point.

REFERENCES

[1] Cheng, Wei, Tan, Zhenghua, Zhu, Zhiliang, and Chang, Guiran. "The Design of Serial ATA Bus  
 Control Chip" *Journal of Computers* [Online], Vol. 5 No. 4, pp. 524-532, April 2010

[2] M. T. LoBue, "Surveying today's most popular storage interfaces," in Computer, vol. 35, no. 12, pp.   
 48-55, Dec 2002.

[3] Krause, Michael. *The Solid State Storage (R-)Evolution*, *Storage Developers Conference,* Santa Clara, pp. 1-32, 2012

[4] Arie Tal. “Two Flash Technologies Compared: NOR vs NAND” Web. October 2002, pp. 1-10

[5] S. Cho, S. Chang and I. Jo, "The solid-state drive technology, today and tomorrow," 2015 IEEE 31st  
 International Conference on Data Engineering, Seoul, 2015, pp. 1520-1522.

[6] Intel Company, “NVM Express”, Revision 1.2, Web. November 2014.

[7] Y. Son, H. Kang, H. Han and H. Y. Yeom, "An Empirical Evaluation of NVM Express SSD," Cloud  
 and Autonomic Computing (ICCAC), 2015 International Conference on, Boston, MA, 2015,   
 pp. 275-282.

[8] Cobb, Danny, and Amber Huffman. "NVM Express and the PCI Express\* SSD  
 Revolution." *Intel Developer Forum,* pp. 1-43, 2012

[9] A. Huffman and D. Juenemann, "The Nonvolatile Memory Transformation of Client Storage,"   
 in Computer, vol. 46, no. 8, pp. 38-44, August 2013.

[10] Sivashankar and S. Ramasamy, "Design and implementation of non-volatile memory  
 express," Recent Trends in Information Technology (ICRTIT), 2014 International Conference on,  
 Chennai, 2014, pp. 1-6.

[11] IOL Interoperability Laboratory, “NVMe Integrator’s List”, Web. March 2016

[12] Bruce Nikkel, “NVM express drives and digital forensics” Digital Forensics 16, 2016, pp. 38-45

[13] S. Kim and C. H. Lam, "Transition of memory technologies," Proceedings of Technical Program   
 of 2012 VLSI Technology, System and Application, Hsinchu, 2012, pp. 1-3.

[14] Micron Technology, Inc, “Breakthrough Nonvolatile Memory Technology” *3D XPoint Technology*.   
 Web.

[15] Intel Company, "3D XPoint™ Unveiled-The Next Breakthrough in Memory  
 Technology." Web. 14 Apr. 2016.

[16] Intel Company, "Introducing Intel® Optane™ Technology – Bringing 3D XPoint™ Memory  
 to Storage and Memory Products" *Intel Newsroom*. Web. July 2015.

[17] Nelson, Fritz, and Paul Alcorn. "Intel-Micron 3D XPoint at Xroads." *Tom's Hardware*.   
 Web. Oct. 2015.

[18] Intel Company, “Fun Facts: How Fast and Robust is 3D XPoint™ Technology?” Web. July 2015

[19] Y. Liu, C. Zhou and X. Cheng, "Hybrid SSD with PCM," Non-Volatile Memory Technology  
 Symposium (NVMTS), 2011 11th Annual, Shanghai, 2011, pp. 1-5.

[20] Hyojun Kim, Sangeetha Seshadri, Clement L. Dickey, Lawrence Chiu, “Evaluating Phase Change  
 Memory for Enterprise Storage Systems: A Study of Caching and Tiering Approaches” *ACM  
 Transactions on Storage, Vol. 10, No. 4, Article 1*. October 2014, pp 15:1-15:21

[21] P. J. Nair, C. Chou, B. Rajendran and M. K. Qureshi, "Reducing read latency of phase change  
 memory via early read and Turbo Read," 2015 IEEE 21st International Symposium on High  
 Performance Computer Architecture (HPCA), Burlingame, CA, 2015, pp. 309-319.

[22] M. A. Islam, M. Y. Arafath and M. J. Hasan, "Design of DDR4 SDRAM controller," Electrical and  
 Computer Engineering (ICECE), 2014 International Conference on, Dhaka, 2014, pp. 148-151.

[23] Dustin Sklavos, “From DDR3 to DDR4: Bandwidth by the Numbers”, Web. September 2014